Introduction

This document is a demo (using a selected subset out of a total of 377 species) of a data analysis done to aid the selection of a set of indicator species for a coral reef monitoring program in the Philippines. The analysis served as a practical aid in selecting suitable indicator species by comparing statistical differences in species occurrence with the functional characteristics of these species. The information was used to make informed choices about which species to include in the monitoring program; it was not intended to answer any hard scientific questions.

Methods

The following analyses are data distribution analyses and analyses of variance, carried out at species level for the species sampled. The independent variable is the site name, while the dependent (test) variable is the randomised total score for each species.

The randomised total score was collected during (non-random) SCUBA swims of 50 minutes at 15 different sites, with 6 samples per site. Each swim was divided into 5 blocks of 10 minutes, and presence/absence of each species was recorded per 10-minute block. The presence/absence data were randomised over the blocks, and points were then awarded by multiplying the presence/absence of each species per sample with the block weights c(5,4,3,2,1). This follows the Rapid Visual Census technique (Hill and Wilkinson 2004), which reasons that species found in the first block of a random swim are likely very abundant because they take less effort to find, while species found only in the last block are probably less abundant. Because local circumstances prevented the swims from being done in a random manner, the randomisation was done post hoc.
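The scoring step described above can be sketched as follows. This is a minimal illustration in Python, not the code used for the original analysis (which was done in R); the function name, the fixed seed and the example sample are assumptions made only for this sketch.

```python
import random

# Points for blocks 1..5, mirroring the c(5,4,3,2,1) weights in the text.
WEIGHTS = (5, 4, 3, 2, 1)


def randomised_score(presence, rng):
    """Shuffle one sample's five 0/1 flags over the blocks, then sum the
    weights of the blocks in which the species ends up being present."""
    if len(presence) != len(WEIGHTS):
        raise ValueError("expected one 0/1 flag per 10-minute block")
    shuffled = list(presence)
    rng.shuffle(shuffled)  # post-hoc randomisation over the blocks
    return sum(w * p for w, p in zip(WEIGHTS, shuffled))


rng = random.Random(42)        # fixed seed only to make the sketch repeatable
observed = [1, 0, 1, 1, 0]     # hypothetical sample: seen in blocks 1, 3 and 4
score = randomised_score(observed, rng)
print(score)
```

A species seen in three of the five blocks always scores between 6 (blocks 3, 4, 5 after shuffling) and 12 (blocks 1, 2, 3); a species seen in every block scores 15 regardless of the shuffle.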

Per species group, a hierarchical cluster analysis is first done with Ward clustering and binary distances based on species presence/absence per sample. This groups species that often occur together. Using the R pvclust function (Suzuki and Shimodaira 2015), significant groups are chosen with alpha=0.9. The alpha was chosen more or less subjectively, by visual inspection of the dendrogram, so that the resulting cluster sizes are logical. This does produce clusters that have a higher probability of arising by chance, but we consider it good enough for a reasonable estimate when selecting our indicator species, even though it doesn’t allow hard statistical conclusions.

Since the data are count data in which a count of 0 is over-represented, the distribution of the total score per species/sample is analysed first. The data are clearly not normally distributed. The variance/mean ratio is calculated; data that follow a Poisson distribution should have a variance/mean ratio of about 1. A plot is made of the observed frequencies of each score per species against the expected frequencies under different distributions.
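As an illustration of the variance/mean ratio, here is a minimal Python sketch. The scores are hypothetical, and the use of the sample variance (n - 1 denominator) is an assumption, since the text does not state which variance estimator was used.

```python
from statistics import mean, variance


def variance_mean_ratio(scores):
    """Index of dispersion: sample variance divided by the mean.
    A value near 1 is what a Poisson distribution would give;
    values well above 1 indicate overdispersion."""
    m = mean(scores)
    if m == 0:
        raise ValueError("ratio is undefined when every score is zero")
    return variance(scores) / m  # variance() uses the n - 1 denominator


# Hypothetical total scores for one species, with many zeros.
scores = [0, 0, 0, 5, 9, 12, 0, 3, 0, 15]
ratio = variance_mean_ratio(scores)
print(round(ratio, 1))
```

For these made-up scores the ratio is well above 1, which is the pattern that motivates trying negative binomial and zero-inflated models next to the Poisson model.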

Different GLM models are fitted to the data for each species, with the randomised score as dependent variable and site name as independent variable. The best-fitting model is selected based on its AIC (which is derived from the log-likelihood). Using a likelihood ratio test, this model is compared to a model of the same type in which the relative species abundance isn’t explained by any variable. If the outcome is significant, the model explains the observed data significantly better than the model without site name does, so the site has influence on the relative species abundance. In this case the individual sites are compared to the overall mean and a table is shown of all model coefficients (sites) with their influence on the overall mean and whether or not that coefficient is significant (p-value < 0.05). The table is sorted by p-value.
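The arithmetic behind the AIC and the likelihood ratio test can be sketched in Python as below. The log-likelihoods are copied from the Bicolor Angelfish output in the Results; the helper names are hypothetical, and the closed-form chi-square tail used here is only valid for an even number of degrees of freedom (which holds for the 8 degrees of freedom in this report).

```python
import math


def aic(log_lik, n_params):
    """Akaike information criterion; lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_lik


def lrt_statistic(log_lik_null, log_lik_full):
    """Likelihood ratio statistic for nested models."""
    return 2 * (log_lik_full - log_lik_null)


def chi2_upper_tail_even_df(x, df):
    """P(X > x) for a chi-square variable with an EVEN number of degrees
    of freedom: exp(-x/2) * sum_{k < df/2} (x/2)^k / k!"""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Presence-absence model for centropyge_bicolor: intercept-only vs. site model.
stat = lrt_statistic(-11.5862, -4.1589)
p_value = chi2_upper_tail_even_df(stat, df=8)
print(round(stat, 3), round(p_value, 3))

# AIC of the Poisson site model for the same species (9 parameters).
aic_poisson = aic(-127.86, 9)
```

The statistic and p-value agree with the R output for that species to the precision shown there; in practice a statistics library would normally be used for the chi-square tail instead of the hand-written helper.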

In the case that a zero-inflated model has the best fit, there are actually two tests: one (zero) tests whether there is a significant difference between sites in the chance of having a zero-count for that species (so a higher value for that site means less likely to encounter the species). The other test (count) tests whether there is a difference between sites in the relative abundance.
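To make the two parts of a zero-inflated model concrete, the sketch below writes out the zero-inflated Poisson probabilities in Python. It assumes a logit link for the zero part, which the text does not state explicitly; the function names and the example values of lambda and the zero probability are hypothetical, while the two coefficients at the end are taken from the Titan Triggerfish table in the Results.

```python
import math


def inv_logit(eta):
    """Map a coefficient on the logit scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def zip_pmf(k, lam, pi_zero):
    """Zero-inflated Poisson: with probability pi_zero the count is a
    structural zero, otherwise it follows a Poisson(lam) distribution."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson


# Hypothetical parameters: 40% structural zeros, Poisson mean of 3.
p0 = zip_pmf(0, 3.0, 0.4)

# Zero part for balistoides_viridescens, assuming a logit link:
# intercept (reference site) and intercept + Dauin Poblacion District 1.
p_zero_reference = inv_logit(-19.7445376)
p_zero_dauin = inv_logit(-19.7445376 + 19.7435969)
print(round(p0, 3), p_zero_reference < 1e-8, round(p_zero_dauin, 2))
```

Under that assumption, a large positive zero-part coefficient for a site means a much higher chance of a structural zero there than at the reference site, which is the "less likely to encounter the species" reading given above.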

If the likelihood ratio test shows that the best-fitting model is not significant, nothing more is done for the GLM. If it is significant, the table of model coefficients is shown together with two graphs that allow the model assumptions to be evaluated. If the mean (continuous line) in the plot of fitted values vs residuals is (mostly) 0, and the normal Q-Q plot shows the residuals to be normally distributed, the model assumptions are considered valid.

As a last test at species level, a Kruskal-Wallis test is done to show the same effect of site name on species abundance without assuming a particular data distribution. If the Kruskal-Wallis test is significant, it is followed by a pairwise Dunn post-hoc test with Bonferroni adjustment to show the differences between sites.
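The Bonferroni adjustment applied to the pairwise comparisons is simple enough to sketch in Python. The site names are the nine that appear in the output tables of this demo; the raw p-values are hypothetical, and the helper name is an assumption for illustration.

```python
from itertools import combinations

SITES = [
    "Andulay", "Antulang", "Basak", "Dauin Poblacion District 1",
    "Guinsuan", "Kookoo's Nest", "Lutoban Pier", "Lutoban South",
    "Malatapay Pier",
]


def bonferroni(p_values):
    """Multiply each raw p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Every unordered pair of sites is one comparison in the post-hoc test.
pairs = list(combinations(SITES, 2))
n_comparisons = len(pairs)

# Hypothetical raw p-values for three comparisons.
adjusted = bonferroni([0.001, 0.02, 0.5])
print(n_comparisons, adjusted)
```

With nine sites there are 36 pairwise comparisons, so a raw p-value has to be below roughly 0.0014 to stay under 0.05 after adjustment, which is why only a few site pairs appear in the post-hoc tables.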

Finally we analysed the Shannon-Wiener species diversity for all fish species, per family and for the selected indicator species to show how well the selected indicator species represented the overall species diversity.
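For reference, the Shannon-Wiener index is H' = -sum(p_i * ln(p_i)), where p_i is the proportion of species i in the sample. A minimal Python sketch follows; the abundance values are hypothetical, and the natural logarithm is an assumption since the text does not name the log base.

```python
import math


def shannon_wiener(abundances):
    """Shannon-Wiener diversity H' using the natural logarithm.
    Species with zero abundance contribute nothing to the sum."""
    total = sum(abundances)
    if total <= 0:
        raise ValueError("need at least one species with positive abundance")
    proportions = [a / total for a in abundances if a > 0]
    return -sum(p * math.log(p) for p in proportions)


# Hypothetical relative abundance scores for four species at one site.
h_even = shannon_wiener([10, 10, 10, 10])    # perfectly even community
h_skewed = shannon_wiener([37, 1, 1, 1])     # dominated by one species
print(round(h_even, 3), round(h_skewed, 3))
```

The even community reaches the maximum for four species, ln(4), while the skewed one scores much lower; comparing the index for the indicator subset against the full species list is the kind of check described above.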

Results

Butterflyfish and Angelfish

Species grouping

Hierarchical cluster analysis investigating species grouping across samples based on their presence/absence. Dissimilarities are calculated using the Jaccard index: the dissimilarity between two species i and j is the number of samples that contain both species, divided by the number of samples that contain at least one of the two, subtracted from 1. Based on these dissimilarities, an average hierarchical clustering is done to group the species.
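The Jaccard dissimilarity between two species can be sketched in Python as below, using the standard definition of the index (shared samples divided by samples containing at least one of the two species). The presence/absence vectors are hypothetical, and returning 0 when neither species occurs anywhere is a convention chosen for this sketch.

```python
def jaccard_dissimilarity(a, b):
    """1 - |samples with both species| / |samples with at least one|."""
    if len(a) != len(b):
        raise ValueError("both species need one flag per sample")
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    if either == 0:
        return 0.0  # convention for this sketch: two all-absent species
    return 1 - both / either


# Hypothetical presence/absence of two species over six samples.
species_i = [1, 1, 0, 1, 0, 0]
species_j = [1, 0, 0, 1, 1, 0]
d = jaccard_dissimilarity(species_i, species_j)
print(d)
```

Samples in which both species are absent do not enter the calculation, so two rare species are not made to look similar merely because both are missing from most samples.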

## Creating a temporary cluster...done:
## socket cluster with 3 nodes on host 'localhost'
## Multiscale bootstrap... Done.
Clustering diagram

Table of the species clusters

group species
1 pygoplites_diacanthus_pres
1 centropyge_vroliki_pres
1 centropyge_bicolor_pres
1 centropyge_tibicen_pres
1 chaetodon_adiergastos_pres
1 chaetodon_kleinii_pres
1 chaetodon_triangulum_pres
1 heniochus_varius_pres
1 chaetodon_vagabundus_pres
2 chaetodon_selene_pres
2 chaetodon_cittrinellus_pres
3 chaetodon_ulietensis_pres
3 chaetodon_bennetti_pres
3 chaetodon_lineolatus_pres
4 heniochus_diphreutes_pres
4 centropyge_bispinosus_pres
5 chaetodontoplus_mesoleucus_pres
5 pomacanthus_navarchus_pres
5 chaetodon_rafflesi_pres
5 coradion_melanopus_pres
5 chaetodon_melannotus_pres
5 chaetodon_oxycephalus_pres
5 chaetodon_ocellicaudus_pres
5 chaetodon_speculum_pres

centropyge_bicolor

Bicolor Angelfish

Data distribution

Variance/Mean Ratio: 2.1

Observed frequencies of the total score vs expected frequencies of different distributions.

Boxplot of the relative abundances per site. Boxplot of the relative abundances per site, excluding 0-counts.

Relative Abundance models

## [1] "Skipping Zero-inflated poisson model due to errors"
## [1] "Skipping Zero-inflated negative binomial model due to errors"

Generalized Linear Model fit and parameters using Poisson model

## Likelihood ratio test
## 
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   1 -186.78                         
## 2   9 -127.86  8 117.85  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
coefficient value pvalue
1 (Intercept) 2.6508918 0.000
4 data$site_nameDauin Poblacion District 1 -3.0563569 0.000
3 data$site_nameBasak -0.3996000 0.020
7 data$site_nameLutoban Pier -0.3651138 0.031
9 data$site_nameMalatapay Pier -0.3155169 0.059
5 data$site_nameGuinsuan -0.2231436 0.170
6 data$site_nameKookoo’s Nest -0.2231436 0.170
8 data$site_nameLutoban South -0.0359320 0.816
2 data$site_nameAntulang 0.0232569 0.879
Plots for evaluation of model conditions.

Relative abundance Kruskal-Wallis

Kruskal-Wallis test, which makes no assumptions about the data distribution.

## 
##  Kruskal-Wallis rank sum test
## 
## data:  data$y by data$site_name
## Kruskal-Wallis chi-squared = 29.587, df = 8, p-value = 0.0002501

Dunn post-hoc test results (with Bonferroni p-value adjustment) after the Kruskal-Wallis test, showing only significant (p-adj<0.05) results.

site difference p_adj
Dauin Poblacion District 1-Andulay 4 0.001
Dauin Poblacion District 1-Antulang 4 0.000
Lutoban South-Dauin Poblacion District 1 4 0.003

Presence-Absence model

Logistic regression testing the chance the species is encountered at each site.

## Likelihood ratio test
## 
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
##   #Df   LogLik Df  Chisq Pr(>Chisq)  
## 1   1 -11.5862                       
## 2   9  -4.1589  8 14.855    0.06203 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

centropyge_tibicen

Keyhole Angelfish

Data distribution

Variance/Mean Ratio: 3.8

Observed frequencies of the total score vs expected frequencies of different distributions.

Boxplot of the relative abundances per site. Boxplot of the relative abundances per site, excluding 0-counts.

Relative Abundance models

Generalized Linear Model fit and parameters using Poisson model

## Likelihood ratio test
## 
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   1 -214.97                         
## 2   9 -116.51  8 196.92  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
coefficient value pvalue
2 data$site_nameAntulang 1.5040774 0.000
3 data$site_nameBasak 2.0476928 0.000
5 data$site_nameGuinsuan 2.3864666 0.000
7 data$site_nameLutoban Pier 1.7707061 0.000
8 data$site_nameLutoban South 2.2380466 0.000
9 data$site_nameMalatapay Pier 2.1400662 0.000
6 data$site_nameKookoo’s Nest 0.9162907 0.028
4 data$site_nameDauin Poblacion District 1 -0.9808293 0.147
1 (Intercept) 0.2876821 0.416
Plots for evaluation of model conditions.

Relative abundance Kruskal-Wallis

Kruskal-Wallis test, which makes no assumptions about the data distribution.

## 
##  Kruskal-Wallis rank sum test
## 
## data:  data$y by data$site_name
## Kruskal-Wallis chi-squared = 44.031, df = 8, p-value = 5.613e-07

Dunn post-hoc test results (with Bonferroni p-value adjustment) after the Kruskal-Wallis test, showing only significant (p-adj<0.05) results.

site difference p_adj
Dauin Poblacion District 1-Basak 3 0.049
Guinsuan-Andulay 4 0.000
Guinsuan-Dauin Poblacion District 1 5 0.000
Kookoo’s Nest-Guinsuan 4 0.010
Lutoban South-Andulay 4 0.008
Lutoban South-Dauin Poblacion District 1 4 0.002
Malatapay Pier-Andulay 3 0.042
Malatapay Pier-Dauin Poblacion District 1 4 0.013

Presence-Absence model

Logistic regression testing the chance the species is encountered at each site.

## Likelihood ratio test
## 
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
##   #Df   LogLik Df Chisq Pr(>Chisq)    
## 1   1 -22.6521                        
## 2   9  -6.8623  8 31.58  0.0001107 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

P-values for model parameters

Pr(>|z|)
(Intercept) 1.000
Antulang 0.999
Basak 0.999
Dauin Poblacion District 1 0.239
Guinsuan 0.999
Kookoo’s Nest 0.999
Lutoban Pier 0.999
Lutoban South 0.999
Malatapay Pier 0.999
Boxplot of the predicted values and confidence interval of the chance of encountering the species per site.

Normal Q-Q plot for the model residuals.

chaetodon_adiergastos

Panda Butterflyfish

Data distribution

Variance/Mean Ratio: 2.3

Observed frequencies of the total score vs expected frequencies of different distributions.

Boxplot of the relative abundances per site. Boxplot of the relative abundances per site, excluding 0-counts.

Relative Abundance models

Generalized Linear Model fit and parameters using Gaussian model

## Likelihood ratio test
## 
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
##   #Df  LogLik Df  Chisq Pr(>Chisq)  
## 1   2 -157.69                       
## 2  10 -149.11  8 17.159    0.02849 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
coefficient value pvalue
1 (Intercept) 9.6666667 0.000
4 data$site_nameDauin Poblacion District 1 -5.1666667 0.033
2 data$site_nameAntulang 3.3333333 0.169
5 data$site_nameGuinsuan -2.6666667 0.271
6 data$site_nameKookoo’s Nest -2.5000000 0.302
7 data$site_nameLutoban Pier -1.8333333 0.449
3 data$site_nameBasak 1.3333333 0.582
8 data$site_nameLutoban South -0.3333333 0.890
9 data$site_nameMalatapay Pier -0.1666667 0.945
Plots for evaluation of model conditions.

Relative abundance Kruskal-Wallis

Kruskal-Wallis test, which makes no assumptions about the data distribution.

## 
##  Kruskal-Wallis rank sum test
## 
## data:  data$y by data$site_name
## Kruskal-Wallis chi-squared = 15.015, df = 8, p-value = 0.05885

Presence-Absence model

Logistic regression testing the chance the species is encountered at each site.

## Likelihood ratio test
## 
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
##   #Df  LogLik Df  Chisq Pr(>Chisq)
## 1   1 -8.5542                     
## 2   9 -5.4067  8 6.2949     0.6142

chaetodon_punctatofasciatus

Spot-Banded Butterflyfish

Data distribution

Variance/Mean Ratio: 6.1

Observed frequencies of the total score vs expected frequencies of different distributions.

Boxplot of the relative abundances per site. Boxplot of the relative abundances per site, excluding 0-counts.

Relative Abundance models

## [1] "Skipping Zero-inflated poisson model due to errors"
## [1] "Skipping Zero-inflated negative binomial model due to errors"

Generalized Linear Model fit and parameters using Poisson model

## Likelihood ratio test
## 
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   1 -269.15                         
## 2   9 -116.10  8 306.11  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
coefficient value pvalue
1 (Intercept) 1.1526795 0.000
4 data$site_nameDauin Poblacion District 1 1.3993664 0.000
6 data$site_nameKookoo’s Nest 1.5553707 0.000
8 data$site_nameLutoban South 1.4375877 0.000
3 data$site_nameBasak -1.8458267 0.003
2 data$site_nameAntulang -1.1526795 0.014
7 data$site_nameLutoban Pier 0.5520686 0.055
9 data$site_nameMalatapay Pier 0.3136576 0.299
5 data$site_nameGuinsuan -18.4552646 0.990
Plots for evaluation of model conditions.

Relative abundance Kruskal-Wallis

Kruskal-Wallis test, which makes no assumptions about the data distribution.

## 
##  Kruskal-Wallis rank sum test
## 
## data:  data$y by data$site_name
## Kruskal-Wallis chi-squared = 43.231, df = 8, p-value = 7.948e-07

Dunn post-hoc test results (with Bonferroni p-value adjustment) after the Kruskal-Wallis test, showing only significant (p-adj<0.05) results.

site difference p_adj
Dauin Poblacion District 1-Basak 3 0.027
Guinsuan-Dauin Poblacion District 1 4 0.010
Kookoo’s Nest-Andulay 3 0.049
Kookoo’s Nest-Antulang 4 0.005
Kookoo’s Nest-Basak 4 0.002
Kookoo’s Nest-Guinsuan 4 0.001
Lutoban South-Antulang 3 0.029
Lutoban South-Basak 4 0.010
Lutoban South-Guinsuan 4 0.004

Presence-Absence model

Logistic regression testing the chance the species is encountered at each site.

## Likelihood ratio test
## 
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   1 -35.594                         
## 2   9 -16.088  8 39.012   4.89e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

P-values for model parameters

Pr(>|z|)
(Intercept) 1.000
Antulang 0.560
Basak 0.239
Dauin Poblacion District 1 0.996
Guinsuan 0.996
Kookoo’s Nest 0.996
Lutoban Pier 0.239
Lutoban South 0.996
Malatapay Pier 0.239
Boxplot of the predicted values and confidence interval of the chance of encountering the species per site.

Normal Q-Q plot for the model residuals.

chaetodon_triangulum

Triangular Butterflyfish

Data distribution

Variance/Mean Ratio: 3.4

Observed frequencies of the total score vs expected frequencies of different distributions.

Boxplot of the relative abundances per site. Boxplot of the relative abundances per site, excluding 0-counts.

Relative Abundance models

Generalized Linear Model fit and parameters using Negative Binomial model

## Likelihood ratio test
## 
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   2 -159.06                         
## 2  10 -135.94  8 46.241  2.139e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
coefficient value pvalue
1 (Intercept) 1.6422277 0.000
6 data$site_nameKookoo’s Nest 0.9227216 0.000
9 data$site_nameMalatapay Pier -1.2367626 0.003
4 data$site_nameDauin Poblacion District 1 0.7556675 0.005
2 data$site_nameAntulang -0.9490806 0.011
5 data$site_nameGuinsuan 0.6768867 0.012
3 data$site_nameBasak 0.2548922 0.373
8 data$site_nameLutoban South 0.2548922 0.373
7 data$site_nameLutoban Pier 0.1495317 0.607
Plots for evaluation of model conditions.

Relative abundance Kruskal-Wallis

Kruskal-Wallis test, which makes no assumptions about the data distribution.

## 
##  Kruskal-Wallis rank sum test
## 
## data:  data$y by data$site_name
## Kruskal-Wallis chi-squared = 31.583, df = 8, p-value = 0.0001106

Dunn post-hoc test results (with Bonferroni p-value adjustment) after the Kruskal-Wallis test, showing only significant (p-adj<0.05) results.

site difference p_adj
Dauin Poblacion District 1-Antulang 3 0.028
Kookoo’s Nest-Antulang 4 0.003
Malatapay Pier-Dauin Poblacion District 1 4 0.017
Malatapay Pier-Kookoo’s Nest 4 0.002

Presence-Absence model

Logistic regression testing the chance the species is encountered at each site.

## Likelihood ratio test
## 
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
##   #Df  LogLik Df  Chisq Pr(>Chisq)  
## 1   1 -18.837                       
## 2   9 -10.681  8 16.311    0.03814 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

P-values for model parameters

Pr(>|z|)
(Intercept) 0.998
Antulang 0.998
Basak 0.998
Dauin Poblacion District 1 1.000
Guinsuan 1.000
Kookoo’s Nest 1.000
Lutoban Pier 1.000
Lutoban South 1.000
Malatapay Pier 0.998
Boxplot of the predicted values and confidence interval of the chance of encountering the species per site.

Normal Q-Q plot for the model residuals.

Parrotfish, Groupers, Sweetlips, Rabbitfish, Snappers

Species grouping

Hierarchical cluster analysis investigating species grouping across samples based on their presence/absence. Dissimilarities are calculated using the Jaccard index: the dissimilarity between two species i and j is the number of samples that contain both species, divided by the number of samples that contain at least one of the two, subtracted from 1. Based on these dissimilarities, an average hierarchical clustering is done to group the species.

## Creating a temporary cluster...done:
## socket cluster with 3 nodes on host 'localhost'
## Multiscale bootstrap... Done.
Clustering diagram

Table of the species clusters

group species
1 unknown_pres
1 unknown_pres.1
2 epinephelus_fasciatus_pres
2 lutjanus_decussatus_pres
2 lutjanus_fulvus_pres
2 cephalopholis_argus_pres
2 scarus_niger_pres
2 macolor_macularis_juv_pres
3 plectorhinchus_polytaenia_pres
3 lutjanus_monostigma_pres
4 chlorurus_sordidus_ip_pres
4 scarus_dimidiatus_ip_pres
5 siganus_virgatus_pres
5 chlorurus_microrhinos_pres
6 plectorhinchus_vittatus_pres
6 plectorhinchus_lineatus_pres
7 diplorion_bifasciatum_pres
7 cephalopholis_cyanostigma_pres
8 siganus_punctatissimus_pres
8 scarus_hypselopterus_pres
9 epinephelus_ongus_juv_pres
9 variola_louti_juv_pres
10 plectorhinchus_lessonii_pres
10 plectorhinchus_chaetodonoides_pres
11 gracila_albomarginata_juv_pres
11 gracila_albomarginata_pres
12 scarus_schlegeli_ip_pres
12 scarus_psittacus_pres
13 lutjanus_quinquelineatus_pres
13 lutjanus_vitta_pres
14 cephalopholis_miniata_pres
14 lutjanus_rivulatus_pres
15 scarus_ghobban_ip_pres
15 scarus_forsteni_ip_pres
16 lutjanus_biguttatus_pres
16 chlorurus_bleekeri_ip_pres
16 scarus_flavipectoralis_ip_pres
16 cephalopholis_microprion_pres
16 variola_louti_pres
17 cetoscarus_ocellatus_ip_pres
17 variola_albimarginata_juv_pres
18 diagramma_pictum_juv_pres
18 diagramma_pictum_sub-adult_pres
18 plectorhinchus_vittatus_juv_pres

cephalopholis_argus

Peacock Grouper

Data distribution

Variance/Mean Ratio: 2.9

Observed frequencies of the total score vs expected frequencies of different distributions.

Boxplot of the relative abundances per site. Boxplot of the relative abundances per site, excluding 0-counts.

Relative Abundance models

Generalized Linear Model fit and parameters using Gaussian model

## Likelihood ratio test
## 
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   2 -138.74                         
## 2  10 -111.13  8 55.206  4.027e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
coefficient value pvalue
1 (Intercept) 14.8 0.000
5 data$site_nameGuinsuan -14.0 0.000
7 data$site_nameLutoban Pier -8.8 0.000
9 data$site_nameMalatapay Pier -7.2 0.000
8 data$site_nameLutoban South -6.8 0.001
3 data$site_nameBasak -3.4 0.093
4 data$site_nameDauin Poblacion District 1 -1.4 0.489
2 data$site_nameAntulang -1.0 0.621
6 data$site_nameKookoo’s Nest -0.8 0.692
Plots for evaluation of model conditions.

Relative abundance Kruskal-Wallis

Kruskal-Wallis test, which makes no assumptions about the data distribution.

## 
##  Kruskal-Wallis rank sum test
## 
## data:  data$y by data$site_name
## Kruskal-Wallis chi-squared = 30.668, df = 8, p-value = 0.0001609

Dunn post-hoc test results (with Bonferroni p-value adjustment) after the Kruskal-Wallis test, showing only significant (p-adj<0.05) results.

site difference p_adj
Guinsuan-Andulay 4 0.003
Guinsuan-Antulang 3 0.035
Kookoo’s Nest-Guinsuan 4 0.008

Presence-Absence model

Logistic regression testing the chance the species is encountered at each site.

## Likelihood ratio test
## 
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
##   #Df   LogLik Df  Chisq Pr(>Chisq)  
## 1   1 -15.6974                       
## 2   9  -8.3691  8 14.657    0.06617 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

epinephelus_fasciatus

Blacktip Grouper

Data distribution

Variance/Mean Ratio: 1.5

Observed frequencies of the total score vs expected frequencies of different distributions.

Boxplot of the relative abundances per site. Boxplot of the relative abundances per site, excluding 0-counts.

Relative Abundance models

## [1] "Skipping Zero-inflated poisson model due to errors"
## [1] "Skipping Zero-inflated negative binomial model due to errors"

Generalized Linear Model fit and parameters using Gaussian model

## Likelihood ratio test
## 
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   2 -126.17                         
## 2  10 -108.71  8 34.922  2.763e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
coefficient value pvalue
1 (Intercept) 15.0 0.000
4 data$site_nameDauin Poblacion District 1 -7.4 0.000
6 data$site_nameKookoo’s Nest -8.6 0.000
7 data$site_nameLutoban Pier -7.0 0.000
9 data$site_nameMalatapay Pier -4.2 0.028
3 data$site_nameBasak -2.6 0.175
8 data$site_nameLutoban South -2.2 0.251
5 data$site_nameGuinsuan -1.6 0.404
2 data$site_nameAntulang -1.0 0.602
Plots for evaluation of model conditions.

Relative abundance Kruskal-Wallis

Kruskal-Wallis test, which makes no assumptions about the data distribution.

## 
##  Kruskal-Wallis rank sum test
## 
## data:  data$y by data$site_name
## Kruskal-Wallis chi-squared = 25.076, df = 8, p-value = 0.001509

Dunn post-hoc test results (with Bonferroni p-value adjustment) after the Kruskal-Wallis test, showing only significant (p-adj<0.05) results.

site difference p_adj
Dauin Poblacion District 1-Andulay 3 0.039
Kookoo’s Nest-Andulay 3 0.019

Presence-Absence model

Logistic regression testing the chance the species is encountered at each site.

## Likelihood ratio test
## 
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
##   #Df      LogLik Df Chisq Pr(>Chisq)
## 1   1 -1.3053e-10                    
## 2   9 -1.3053e-10  8     0          1

siganus_guttatus

Golden Rabbitfish

Data distribution

Variance/Mean Ratio: 8.4

Observed frequencies of the total score vs expected frequencies of different distributions.

Boxplot of the relative abundances per site. Boxplot of the relative abundances per site, excluding 0-counts.

Relative Abundance models

## [1] "Skipping Zero-inflated poisson model due to errors"
## [1] "Skipping Zero-inflated negative binomial model due to errors"

Generalized Linear Model fit and parameters using Poisson model

## Likelihood ratio test
## 
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
##   #Df   LogLik Df  Chisq Pr(>Chisq)    
## 1   1 -196.509                         
## 2   9  -42.779  8 307.46  < 2.2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
coefficient value pvalue
3 data$site_nameBasak 22.6821314 0.997
4 data$site_nameDauin Poblacion District 1 22.8040212 0.997
1 (Intercept) -20.3025853 0.998
2 data$site_nameAntulang 19.3862945 0.998
6 data$site_nameKookoo’s Nest 21.4657361 0.998
5 data$site_nameGuinsuan 0.0000002 1.000
7 data$site_nameLutoban Pier 0.0000002 1.000
8 data$site_nameLutoban South 0.0000002 1.000
9 data$site_nameMalatapay Pier 0.0000002 1.000
Plots for evaluation of model conditions.

Relative abundance Kruskal-Wallis

Kruskal-Wallis test, which makes no assumptions about the data distribution.

## 
##  Kruskal-Wallis rank sum test
## 
## data:  data$y by data$site_name
## Kruskal-Wallis chi-squared = 39.008, df = 8, p-value = 4.898e-06

Dunn post-hoc test results (with Bonferroni p-value adjustment) after the Kruskal-Wallis test, showing only significant (p-adj<0.05) results.

site difference p_adj
Basak-Andulay 3 0.019
Dauin Poblacion District 1-Andulay 4 0.009
Guinsuan-Basak 3 0.019
Guinsuan-Dauin Poblacion District 1 4 0.009
Lutoban Pier-Basak 3 0.019
Lutoban Pier-Dauin Poblacion District 1 4 0.009
Lutoban South-Basak 3 0.019
Lutoban South-Dauin Poblacion District 1 4 0.009
Malatapay Pier-Basak 3 0.019
Malatapay Pier-Dauin Poblacion District 1 4 0.009

Presence-Absence model

Logistic regression testing the chance the species is encountered at each site.

## Likelihood ratio test
## 
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   1 -28.643                         
## 2   9  -5.004  8 47.278  1.357e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

P-values for model parameters

Pr(>|z|)
(Intercept) 0.999
Antulang 0.999
Basak 0.998
Dauin Poblacion District 1 0.998
Guinsuan 1.000
Kookoo’s Nest 0.999
Lutoban Pier 1.000
Lutoban South 1.000
Malatapay Pier 1.000
Boxplot of the predicted values and confidence interval of the chance of encountering the species per site.

Normal Q-Q plot for the model residuals.

Surgeonfish, Triggerfish, Filefish, Puffers, Boxfish, Fusiliers, Coral Breams, Emperors, Tobies

Species grouping

Hierarchical cluster analysis investigating species grouping across samples based on their presence/absence. Dissimilarities are calculated using the Jaccard index: the dissimilarity between two species i and j is the number of samples that contain both species, divided by the number of samples that contain at least one of the two, subtracted from 1. Based on these dissimilarities, an average hierarchical clustering is done to group the species.

## Creating a temporary cluster...done:
## socket cluster with 3 nodes on host 'localhost'
## Multiscale bootstrap... Done.
Clustering diagram

Table of the species clusters

group species
1 lethrinus_erythracanthus_pres
1 acanthurus_pyroferus_juv_pres
2 acanthurus_nigrofuscus_pres
2 acanthurus_mata_pres
2 ctenochaetus_binotatus_pres
2 naso_minor_pres
2 zebrasoma_scopas_pres
2 balistapus_undulatus_pres
2 balistoides_viridescens_pres
2 sufflamen_bursa_pres
2 pterocaesio_pisang_pres
2 scolopsis_bilineata_pres
2 arothron_nigropuncatus_pres
2 canthigaster_valentini_pres
2 canthigaster_papua_pres
2 acanthurus_pyroferus_pres
3 scolopsis_affinis_pres
3 sufflamen_chrysopterus_pres
3 scolopsis_affinis_juv_pres
4 ctenochaetus_tominiensis_pres
4 ostracion_solorensis_pres
4 acanthurus_thompsoni_pres
4 ctenochaetus_cyanocheilus_pres
5 monotaxis_heterodon_pres
5 lethrinus_obsoletus_pres
6 diodon_holocanthus_pres
6 naso_thynoides_pres
7 lethrinus_erythracanthus_juv_pres
7 lethrinus_erythropterus_pres
8 naso_hexacanthus_pres
8 gymnocranius_microdon_pres
9 pseudaluttarius_nasicornis_pres
9 arothron_manilensis_pres
10 zebrasoma_flavescens_pres
10 caesio_lunaris_pres
10 ctenochaetus_binotatus_juv_pres
10 paracanthurus_hepatus_pres

arothron_nigropuncatus

Blackspotted Puffer

Data distribution

Variance/Mean Ratio: 3.2

Observed frequencies of the total score vs expected frequencies of different distributions.

Boxplot of the relative abundances per site. Boxplot of the relative abundances per site, excluding 0-counts.

Relative Abundance models

Generalized Linear Model fit and parameters using Gaussian model

## Likelihood ratio test
## 
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   2 -159.90                         
## 2  10 -146.81  8 26.179  0.0009786 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
coefficient value pvalue
1 (Intercept) 6.3333333 0.000
4 data$site_nameDauin Poblacion District 1 6.0000000 0.010
6 data$site_nameKookoo’s Nest 4.8333333 0.037
9 data$site_nameMalatapay Pier -3.1666667 0.172
8 data$site_nameLutoban South -2.6666667 0.250
7 data$site_nameLutoban Pier 1.1666667 0.615
5 data$site_nameGuinsuan 1.0000000 0.666
2 data$site_nameAntulang -0.6666667 0.774
3 data$site_nameBasak -0.5000000 0.829
Plots for evaluation of model conditions.

Relative abundance Kruskal-Wallis

Kruskal-Wallis test, which makes no assumptions about the data distribution.

## 
##  Kruskal-Wallis rank sum test
## 
## data:  data$y by data$site_name
## Kruskal-Wallis chi-squared = 18.613, df = 8, p-value = 0.01707

Dunn post-hoc test results (with Bonferroni p-value adjustment) after the Kruskal-Wallis test, showing only significant (p-adj<0.05) results.

site difference p_adj
Malatapay Pier-Dauin Poblacion District 1 3 0.045
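
A rough sketch of this kind of test sequence in base R is given below, on simulated data. It is not the code used for the report: the Dunn test is not part of base R, so Bonferroni-adjusted pairwise Wilcoxon rank sum tests are swapped in as a stand-in, and the site labels, scores and seed are all invented for illustration.

```r
set.seed(42)

# Hypothetical data: 9 sites with 6 samples each; labels and values are made up.
site  <- factor(rep(paste0("site_", 1:9), each = 6))
score <- rpois(54, lambda = rep(c(2, 2, 2, 8, 2, 2, 2, 2, 12), each = 6))

# Omnibus test: do the score distributions differ between sites?
kw <- kruskal.test(score ~ site)
kw$parameter   # degrees of freedom = number of sites - 1

# Stand-in for the Dunn test: pairwise Wilcoxon tests with Bonferroni adjustment.
pw <- pairwise.wilcox.test(score, site,
                           p.adjust.method = "bonferroni", exact = FALSE)

# Bonferroni multiplies each raw p-value by the number of comparisons, capped at 1.
n_pairs <- choose(nlevels(site), 2)   # number of site pairs
adj <- p.adjust(c(0.001, 0.01, 0.05), method = "bonferroni", n = n_pairs)
adj
```

With 9 sites there are 36 pairwise comparisons, which is why only large differences remain significant after adjustment.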

Presence-Absence model

Logistic regression testing the chance the species is encountered at each site.

## Likelihood ratio test
## 
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
##   #Df  LogLik Df  Chisq Pr(>Chisq)
## 1   1 -16.659                     
## 2   9 -10.341  8 12.634     0.1251

balistoides_viridescens

Titan Triggerfish

Data distribution

Variance/Mean Ratio: 2.9

Observed frequencies of the total score vs expected frequencies of different distributions.

Boxplot of the relative abundances per site. Boxplot of the relative abundances per site, excluding 0-counts.

Relative Abundance models

General Linear Model fit and parameters using Zero-Inflated Poisson model

## Likelihood ratio test
## 
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   2 -143.62                         
## 2  18 -117.73 16 51.766  1.195e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
group coefficient value pvalue
1 count (Intercept) 1.8971200 0.000
8 count data$site_nameLutoban South -1.4311096 0.006
3 count data$site_nameBasak 0.4219944 0.045
7 count data$site_nameLutoban Pier -0.4286010 0.122
5 count data$site_nameGuinsuan -0.4048218 0.170
2 count data$site_nameAntulang -0.3215836 0.196
4 count data$site_nameDauin Poblacion District 1 0.1392919 0.598
9 count data$site_nameMalatapay Pier -0.0746284 0.758
6 count data$site_nameKookoo’s Nest -0.0512934 0.822
10 zero (Intercept) -19.7445376 0.998
13 zero data$site_nameDauin Poblacion District 1 19.7435969 0.998
14 zero data$site_nameGuinsuan 19.0156279 0.998
16 zero data$site_nameLutoban Pier 18.0538928 0.998
17 zero data$site_nameLutoban South 18.1108992 0.998
18 zero data$site_nameMalatapay Pier 18.1226874 0.998
11 zero data$site_nameAntulang -0.0000002 1.000
12 zero data$site_nameBasak -0.0000002 1.000
15 zero data$site_nameKookoo’s Nest -0.0000002 1.000
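
To make the two parts of the table above concrete: a zero-inflated Poisson model combines a "zero" part (the probability of an excess zero, on the logit scale) with a "count" part (the Poisson mean, on the log scale). The sketch below fits an intercept-only version by maximum likelihood in base R on simulated data. It is an illustration under stated assumptions, not the fitting routine used for the report, and `zip_nll` is a hypothetical helper.

```r
set.seed(7)

# Simulated scores: 30% structural zeros, otherwise Poisson with mean 4.
n <- 500
y <- ifelse(runif(n) < 0.3, 0L, rpois(n, lambda = 4))

# Negative log-likelihood of an intercept-only zero-inflated Poisson model.
# par[1] = logit of the zero-inflation probability, par[2] = log of the Poisson mean.
zip_nll <- function(par, y) {
  pi0    <- plogis(par[1])
  lambda <- exp(par[2])
  lik <- ifelse(y == 0,
                pi0 + (1 - pi0) * dpois(0, lambda),   # zero: structural or Poisson
                (1 - pi0) * dpois(y, lambda))         # positive: Poisson only
  -sum(log(lik))
}

fit <- optim(c(0, log(mean(y))), zip_nll, y = y)
est <- c(pi0 = plogis(fit$par[1]), lambda = exp(fit$par[2]))
est

# A zero-part intercept near -19.7, as in the table above, corresponds to an
# estimated excess-zero probability of practically 0 at the reference site.
plogis(-19.7445376)
```

Zero-part coefficients of this magnitude with p-values near 1 indicate sites where the excess-zero probability sits at a boundary, so their individual p-values carry little information.
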
Plots for evaluation of model conditions.

Relative abundance Kruskal-Wallis

Kruskal-Wallis test, making no assumptions about the data distribution.

## 
##  Kruskal-Wallis rank sum test
## 
## data:  data$y by data$site_name
## Kruskal-Wallis chi-squared = 19.788, df = 8, p-value = 0.01117

Dunn post-hoc test results (with Bonferroni p-value adjustment) after the Kruskal-Wallis test, showing only significant (p-adj<0.05) results.

site difference p_adj
Lutoban South-Basak 4 0.007

Presence-Absence model

Logistic regression testing the chance the species is encountered at each site.

## Likelihood ratio test
## 
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
##   #Df  LogLik Df  Chisq Pr(>Chisq)  
## 1   1 -24.330                       
## 2   9 -17.204  8 14.253     0.0754 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

lethrinus_ornatus

Ornate Emperor

Data distribution

Variance/Mean Ratio: 5.9

Observed frequencies of the total score vs expected frequencies of different distributions.

Boxplot of the relative abundances per site. Boxplot of the relative abundances per site, excluding 0-counts.

Relative Abundance models

General Linear Model fit and parameters using Zero-Inflated Poisson model

## Likelihood ratio test
## 
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   2 -149.75                         
## 2  18 -100.22 16 99.054  5.208e-14 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
group coefficient value pvalue
1 count (Intercept) 2.6625879 0.000
5 count data$site_nameGuinsuan -1.5298305 0.000
7 count data$site_nameLutoban Pier -1.2963203 0.000
3 count data$site_nameBasak -0.6931472 0.001
2 count data$site_nameAntulang -0.5031037 0.007
8 count data$site_nameLutoban South -1.6253411 0.016
9 count data$site_nameMalatapay Pier -1.2963185 0.020
4 count data$site_nameDauin Poblacion District 1 -1.0601512 0.030
6 count data$site_nameKookoo’s Nest -0.2954643 0.082
10 zero (Intercept) -19.6146680 0.998
13 zero data$site_nameDauin Poblacion District 1 21.2156999 0.998
14 zero data$site_nameGuinsuan 18.7769782 0.998
16 zero data$site_nameLutoban Pier 19.5742068 0.998
17 zero data$site_nameLutoban South 21.1499991 0.998
18 zero data$site_nameMalatapay Pier 21.2000263 0.998
11 zero data$site_nameAntulang -0.0000001 1.000
12 zero data$site_nameBasak -0.0000001 1.000
15 zero data$site_nameKookoo’s Nest -0.0000001 1.000
Plots for evaluation of model conditions.

Relative abundance Kruskal-Wallis

Kruskal-Wallis test, making no assumptions about the data distribution.

## 
##  Kruskal-Wallis rank sum test
## 
## data:  data$y by data$site_name
## Kruskal-Wallis chi-squared = 39.83, df = 8, p-value = 3.447e-06

Dunn post-hoc test results (with Bonferroni p-value adjustment) after the Kruskal-Wallis test, showing only significant (p-adj<0.05) results.

site difference p_adj
Dauin Poblacion District 1-Andulay 4 0.002
Guinsuan-Andulay 3 0.050
Lutoban Pier-Andulay 3 0.025
Lutoban South-Andulay 4 0.001
Lutoban South-Kookoo’s Nest 3 0.032
Malatapay Pier-Andulay 4 0.001
Malatapay Pier-Kookoo’s Nest 3 0.042

Presence-Absence model

Logistic regression testing the chance the species is encountered at each site.

## Likelihood ratio test
## 
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   1 -35.594                         
## 2   9 -16.088  8 39.012   4.89e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

P-values for model parameters

Pr(>|z|)
(Intercept) 0.996
Antulang 1.000
Basak 1.000
Dauin Poblacion District 1 0.996
Guinsuan 0.997
Kookoo’s Nest 1.000
Lutoban Pier 0.996
Lutoban South 0.996
Malatapay Pier 0.996
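
The parameter p-values above are all close to 1 even though the likelihood ratio test for the site effect is highly significant. This pattern is typical of (quasi-)complete separation: when a species is seen in every sample at some sites and in none at others, the logistic coefficients drift towards plus or minus infinity and their standard errors blow up. The sketch below reproduces the effect on invented data with three hypothetical sites; it is an illustration, not the report's code.

```r
# Invented presence/absence data, six samples per site:
# always present at A, never present at B, mixed at C.
site    <- factor(rep(c("A", "B", "C"), each = 6))
present <- c(rep(1, 6), rep(0, 6), c(1, 0, 1, 1, 0, 1))

# glm() warns about fitted probabilities of 0 or 1; that warning is the symptom.
full <- suppressWarnings(glm(present ~ site, family = binomial))
null <- glm(present ~ 1, family = binomial)

# Wald p-values per coefficient: huge standard errors push them towards 1.
wald_p <- coef(summary(full))[, "Pr(>|z|)"]
wald_p

# Likelihood ratio test of the site effect as a whole remains informative.
lr_stat <- as.numeric(2 * (logLik(full) - logLik(null)))
lr_p    <- pchisq(lr_stat, df = 2, lower.tail = FALSE)
lr_p
```

In such cases the overall likelihood ratio test is the more useful result; the per-site p-values should not be read as evidence that sites do not differ.
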
Boxplot of the predicted values and confidence interval of the chance of encountering the species per site.

Normal Q-Q plot for the model residuals.

pterocaesio_tile

Bluestreak Fusilier

Data distribution

Variance/Mean Ratio: 4.8

Observed frequencies of the total score vs expected frequencies of different distributions.

Boxplot of the relative abundances per site. Boxplot of the relative abundances per site, excluding 0-counts.

Relative Abundance models

## [1] "Skipping Zero-inflated poisson model due to errors"
## [1] "Skipping Zero-inflated negative binomial model due to errors"

General Linear Model fit and parameters using Negative Binomial model

## Likelihood ratio test
## 
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
##   #Df   LogLik Df  Chisq Pr(>Chisq)    
## 1   2 -105.924                         
## 2  10  -88.581  8 34.685  3.052e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
coefficient value pvalue
2 data$site_nameAntulang 1.4087672 0.032
8 data$site_nameLutoban South -1.7047481 0.074
3 data$site_nameBasak 1.1284653 0.088
1 (Intercept) 0.6061358 0.225
7 data$site_nameLutoban Pier 0.7801586 0.245
5 data$site_nameGuinsuan 0.3101549 0.653
4 data$site_nameDauin Poblacion District 1 -0.2006707 0.781
6 data$site_nameKookoo’s Nest -19.9087209 0.996
9 data$site_nameMalatapay Pier -19.9087209 0.996
Plots for evaluation of model conditions.

Relative abundance Kruskal-Wallis

Kruskal-Wallis test, making no assumptions about the data distribution.

## 
##  Kruskal-Wallis rank sum test
## 
## data:  data$y by data$site_name
## Kruskal-Wallis chi-squared = 28.65, df = 8, p-value = 0.0003651

Dunn post-hoc test results (with Bonferroni p-value adjustment) after the Kruskal-Wallis test, showing only significant (p-adj<0.05) results.

site difference p_adj
Kookoo’s Nest-Antulang 4 0.007
Lutoban South-Antulang 3 0.024
Malatapay Pier-Antulang 4 0.007

Presence-Absence model

Logistic regression testing the chance the species is encountered at each site.

## Likelihood ratio test
## 
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   1 -37.096                         
## 2   9 -19.907  8 34.377   3.47e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

P-values for model parameters

Pr(>|z|)
(Intercept) 1.000
Antulang 0.996
Basak 0.239
Dauin Poblacion District 1 0.239
Guinsuan 0.560
Kookoo’s Nest 0.996
Lutoban Pier 0.560
Lutoban South 0.239
Malatapay Pier 0.996
Boxplot of the predicted values and confidence interval of the chance of encountering the species per site.

Normal Q-Q plot for the model residuals.

Wrasses and Goatfish

Species grouping

Hierarchical cluster analysis investigating how species group across samples, based on their presence/absence. Dissimilarities are calculated using the Jaccard index: the dissimilarity between two species i and j is the number of samples containing both species, divided by the number of samples containing at least one of the two, subtracted from 1. Average-linkage hierarchical clustering is then applied to these dissimilarities to group the species.
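
As a small worked example of this distance and clustering step, the sketch below uses base R only. The Jaccard dissimilarity is taken in its usual form (1 minus shared samples divided by samples with at least one of the two species), which is what `dist(method = "binary")` computes. The species names and presence/absence values are invented, and the bootstrap step with pvclust is not reproduced here.

```r
# Invented presence/absence matrix: rows are species, columns are samples.
pa <- rbind(
  species_a = c(1, 1, 0, 0, 1, 0),
  species_b = c(1, 0, 1, 0, 1, 0),
  species_c = c(0, 0, 0, 1, 0, 1)
)

# Jaccard dissimilarity between all pairs of species.
d <- dist(pa, method = "binary")

# The same value by hand for species_a vs species_b.
both   <- sum(pa["species_a", ] == 1 & pa["species_b", ] == 1)  # samples with both
either <- sum(pa["species_a", ] == 1 | pa["species_b", ] == 1)  # samples with at least one
jaccard_ab <- 1 - both / either

# Average-linkage clustering, cut into two groups.
hc     <- hclust(d, method = "average")
groups <- cutree(hc, k = 2)
groups
```

Species that never share a sample, such as `species_c` with the other two here, have a dissimilarity of 1 and end up in a separate group.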

## Creating a temporary cluster...done:
## socket cluster with 3 nodes on host 'localhost'
## Multiscale bootstrap... Done.
Clustering diagram

Table of the species clusters

group species
1 wetmorella_albofasciata_pres
1 anampses_geographicus_pres
2 coris_batuensis_pres
2 cirrhilabrus_cyanopleura_pres
2 cirrhilabrus_cyanopleura_juv_pres
2 halichoeres_prosopeion_pres
2 halichoeres_hortulanus_pres
2 oxycheilinus_digrammus_pres
2 thalassoma_lunare_pres
2 stethojulis_interrupta_pres
2 pseudocheilinus_evanidus_pres
2 bodianus_mesothorax_pres
2 labrioides_dimidiatus_pres
2 labrioides_dimidiatus_juv_pres
2 parupeneus_multifasciatus_pres
2 parupeneus_multifasciatus_juv_pres
2 parupeneus_barberinus_pres
2 parupeneus_barberinus_juv_pres
2 thalassoma_lunare_juv_pres
3 halichoeres_zeylonicus_pres
3 coris_gaimard_pres
4 cheilinus_chlorourus_pres
4 oxycheilinus_celebicus_pres
5 parupeneus_crassilabris_juv_pres
5 stethojulis_trilineata_pres
6 gomphosus_varius_pres
6 thalassoma_hardwicke_pres
6 gomphosus_varius_juv_pres
7 hemigymnus_melapterus_pres
7 halichoeres_melanochir_pres
8 cheilio_inermis_pres
8 parupeneus_barberinoides_juv_pres
9 bodianus_mesothorax_juv_pres
9 pseudodax_mollocanus_pres
10 halichoeres_richmondi_pres
10 cheilinus_fasciatus_pres
10 labrichthys_unileatus_juv_pres
11 anampses_melanurus_pres
11 anampses_melanurus_juv_pres
12 halichoeres_scapularis_juv_pres
12 hologymnosus_annulatus_pres
13 hemigymnus_melapterus_juv_pres
13 labropsis_alleni_pres
14 mulloidichthys_vanicolensis_pres
14 pseudocoris_bleekeri_pres
15 coris_batuensis_juv_pres
15 stethojulis_trilineata_juv_pres
16 halichoeres_podostigma_juv_pres
16 anampses_twistii_pres

bodianus_dictynna

Redfin Hogfish

Data distribution

Variance/Mean Ratio: 3.6

Observed frequencies of the total score vs expected frequencies of different distributions.

Boxplot of the relative abundances per site. Boxplot of the relative abundances per site, excluding 0-counts.

Relative Abundance models

General Linear Model fit and parameters using Zero-Inflated Poisson model

## Likelihood ratio test
## 
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   2 -158.26                         
## 2  18 -122.18 16 72.167  4.152e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
group coefficient value pvalue
1 count (Intercept) 2.2335922 0.000
3 count data$site_nameBasak 0.3813677 0.034
6 count data$site_nameKookoo’s Nest -0.4700032 0.036
4 count data$site_nameDauin Poblacion District 1 -0.4788207 0.045
9 count data$site_nameMalatapay Pier -0.8673259 0.116
5 count data$site_nameGuinsuan 0.2787133 0.124
7 count data$site_nameLutoban Pier -0.2886007 0.342
8 count data$site_nameLutoban South -0.1541504 0.438
2 count data$site_nameAntulang -0.1133289 0.564
10 zero (Intercept) -20.5770960 0.999
13 zero data$site_nameDauin Poblacion District 1 18.9489987 0.999
16 zero data$site_nameLutoban Pier 21.2688750 0.999
18 zero data$site_nameMalatapay Pier 22.1624518 0.999
11 zero data$site_nameAntulang -0.0000001 1.000
12 zero data$site_nameBasak -0.0000001 1.000
14 zero data$site_nameGuinsuan -0.0000001 1.000
15 zero data$site_nameKookoo’s Nest -0.0000001 1.000
17 zero data$site_nameLutoban South -0.0000001 1.000
Plots for evaluation of model conditions.

Relative abundance Kruskal-Wallis

Kruskal-Wallis test, making no assumptions about the data distribution.

## 
##  Kruskal-Wallis rank sum test
## 
## data:  data$y by data$site_name
## Kruskal-Wallis chi-squared = 33.956, df = 8, p-value = 4.137e-05

Dunn post-hoc test results (with Bonferroni p-value adjustment) after the Kruskal-Wallis test, showing only significant (p-adj<0.05) results.

site difference p_adj
Lutoban Pier-Basak 4 0.005
Lutoban Pier-Guinsuan 3 0.027
Malatapay Pier-Basak 4 0.000
Malatapay Pier-Guinsuan 4 0.003

Presence-Absence model

Logistic regression testing the chance the species is encountered at each site.

## Likelihood ratio test
## 
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
##   #Df   LogLik Df  Chisq Pr(>Chisq)    
## 1   1 -25.8749                         
## 2   9  -9.2258  8 33.298  5.441e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

P-values for model parameters

Pr(>|z|)
(Intercept) 0.998
Antulang 1.000
Basak 1.000
Dauin Poblacion District 1 0.998
Guinsuan 1.000
Kookoo’s Nest 1.000
Lutoban Pier 0.998
Lutoban South 1.000
Malatapay Pier 0.998
Boxplot of the predicted values and confidence interval of the chance of encountering the species per site.

Normal Q-Q plot for the model residuals.

macropharyngodon_meleagris

Leopard Wrasse

Data distribution

Variance/Mean Ratio: 4.7

Observed frequencies of the total score vs expected frequencies of different distributions.

Boxplot of the relative abundances per site. Boxplot of the relative abundances per site, excluding 0-counts.

Relative Abundance models

## [1] "Skipping Zero-inflated poisson model due to errors"
## [1] "Skipping Zero-inflated negative binomial model due to errors"

General Linear Model fit and parameters using Negative Binomial model

## Likelihood ratio test
## 
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   2 -118.17                         
## 2  10 -104.88  8 26.584   0.000834 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
coefficient value pvalue
1 (Intercept) 1.8458267 0.000
7 data$site_nameLutoban Pier -2.2512918 0.003
5 data$site_nameGuinsuan -1.4403616 0.033
9 data$site_nameMalatapay Pier -1.2396909 0.060
8 data$site_nameLutoban South -0.9985288 0.121
4 data$site_nameDauin Poblacion District 1 -0.8043728 0.205
2 data$site_nameAntulang -0.4595323 0.459
3 data$site_nameBasak 0.2744368 0.649
6 data$site_nameKookoo’s Nest -21.1484118 0.996
Plots for evaluation of model conditions.

Relative abundance Kruskal-Wallis

Kruskal-Wallis test, making no assumptions about the data distribution.

## 
##  Kruskal-Wallis rank sum test
## 
## data:  data$y by data$site_name
## Kruskal-Wallis chi-squared = 23.091, df = 8, p-value = 0.00325

Dunn post-hoc test results (with Bonferroni p-value adjustment) after the Kruskal-Wallis test, showing only significant (p-adj<0.05) results.

site difference p_adj
Kookoo’s Nest-Andulay 3 0.036
Kookoo’s Nest-Basak 4 0.016

Presence-Absence model

Logistic regression testing the chance the species is encountered at each site.

## Likelihood ratio test
## 
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
##   #Df  LogLik Df Chisq Pr(>Chisq)    
## 1   1 -37.282                        
## 2   9 -23.386  8 27.79  0.0005158 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

P-values for model parameters

Pr(>|z|)
(Intercept) 0.994
Antulang 0.995
Basak 0.995
Dauin Poblacion District 1 0.995
Guinsuan 0.994
Kookoo’s Nest 0.992
Lutoban Pier 0.994
Lutoban South 0.995
Malatapay Pier 0.994
Boxplot of the predicted values and confidence interval of the chance of encountering the species per site.

Normal Q-Q plot for the model residuals.

oxycheilinus_digrammus

Linedcheeked Wrasse

Data distribution

Variance/Mean Ratio: 2.7

Observed frequencies of the total score vs expected frequencies of different distributions.

Boxplot of the relative abundances per site. Boxplot of the relative abundances per site, excluding 0-counts.

Relative Abundance models

General Linear Model fit and parameters using Gaussian model

## Likelihood ratio test
## 
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   2 -162.78                         
## 2  10 -138.75  8 48.056  9.641e-08 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
coefficient value pvalue
1 (Intercept) 12.3333333 0.000
5 data$site_nameGuinsuan -9.6666667 0.000
9 data$site_nameMalatapay Pier -8.1666667 0.000
3 data$site_nameBasak -6.6666667 0.001
6 data$site_nameKookoo’s Nest -2.0000000 0.317
4 data$site_nameDauin Poblacion District 1 -1.5000000 0.453
2 data$site_nameAntulang 1.3333333 0.505
7 data$site_nameLutoban Pier -0.5000000 0.802
8 data$site_nameLutoban South -0.3333333 0.868
Plots for evaluation of model conditions.

Relative abundance Kruskal-Wallis

Kruskal-Wallis test, making no assumptions about the data distribution.

## 
##  Kruskal-Wallis rank sum test
## 
## data:  data$y by data$site_name
## Kruskal-Wallis chi-squared = 28.113, df = 8, p-value = 0.0004533

Dunn post-hoc test results (with Bonferroni p-value adjustment) after the Kruskal-Wallis test, showing only significant (p-adj<0.05) results.

site difference p_adj
Guinsuan-Antulang 4 0.005
Malatapay Pier-Antulang 3 0.045

Presence-Absence model

Logistic regression testing the chance the species is encountered at each site.

## Likelihood ratio test
## 
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
##   #Df  LogLik Df  Chisq Pr(>Chisq)  
## 1   1 -18.837                       
## 2   9 -10.681  8 16.311    0.03814 *
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

P-values for model parameters

Pr(>|z|)
(Intercept) 0.998
Antulang 1.000
Basak 0.998
Dauin Poblacion District 1 1.000
Guinsuan 0.998
Kookoo’s Nest 1.000
Lutoban Pier 1.000
Lutoban South 1.000
Malatapay Pier 0.998
Boxplot of the predicted values and confidence interval of the chance of encountering the species per site.

Normal Q-Q plot for the model residuals.

stethojulis_interrupta

Cutribbon Wrasse

Data distribution

Variance/Mean Ratio: 2.9

Observed frequencies of the total score vs expected frequencies of different distributions.

Boxplot of the relative abundances per site. Boxplot of the relative abundances per site, excluding 0-counts.

Relative Abundance models

General Linear Model fit and parameters using Negative Binomial model

## Likelihood ratio test
## 
## Model 1: update(models$bestfit, data$y ~ 1)
## Model 2: models$bestfit
##   #Df  LogLik Df  Chisq Pr(>Chisq)    
## 1   2 -161.05                         
## 2  10 -140.32  8 41.448  1.719e-06 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
coefficient value pvalue
1 (Intercept) 2.3353749 0.000
6 data$site_nameKookoo’s Nest -1.2367626 0.000
7 data$site_nameLutoban Pier -2.1812242 0.000
4 data$site_nameDauin Poblacion District 1 -0.3894648 0.142
3 data$site_nameBasak -0.2353141 0.363
2 data$site_nameAntulang -0.1381503 0.588
5 data$site_nameGuinsuan -0.1017827 0.688
8 data$site_nameLutoban South -0.0666914 0.792
9 data$site_nameMalatapay Pier 0.0000000 1.000
Plots for evaluation of model conditions.

Relative abundance Kruskal-Wallis

Kruskal-Wallis test, making no assumptions about the data distribution.

## 
##  Kruskal-Wallis rank sum test
## 
## data:  data$y by data$site_name
## Kruskal-Wallis chi-squared = 23.601, df = 8, p-value = 0.002672

Dunn post-hoc test results (with Bonferroni p-value adjustment) after the Kruskal-Wallis test, showing only significant (p-adj<0.05) results.

site difference p_adj
Lutoban Pier-Andulay 3 0.042
Lutoban South-Lutoban Pier 3 0.048
Malatapay Pier-Lutoban Pier 3 0.034

Presence-Absence model

Logistic regression testing the chance the species is encountered at each site.

## Likelihood ratio test
## 
## Model 1: data$y ~ 1
## Model 2: y ~ site_name
##   #Df   LogLik Df  Chisq Pr(>Chisq)
## 1   1 -14.2588                     
## 2   9  -9.2258  8 10.066     0.2604
## [1] "Unkown indicator species:"
## [1] "cheilodipterus_quinquelineatus" "fistularia_commersonii"        
## [3] "myripristis_botche"             "naso_unicornis"                
## [5] "platax_pinnatus"                "pterois_volitans"              
## [7] "cheilinus_undulatus"            "labrichthys_unileatus"

Species diversities per site and family

family site div
815 Indicator Species Guinsuan 3.793934
818 Indicator Species Lutoban South 3.884768
819 Indicator Species Malatapay Pier 3.887487
817 Indicator Species Lutoban Pier 3.912158
816 Indicator Species Kookoo’s Nest 3.952451
813 Indicator Species Basak 3.972587
811 Indicator Species Andulay 3.987899
812 Indicator Species Antulang 4.004278
814 Indicator Species Dauin Poblacion District 1 4.095858
826 Total Lutoban Pier 4.888411
827 Total Lutoban South 4.894894
824 Total Guinsuan 4.907122
825 Total Kookoo’s Nest 4.965014
828 Total Malatapay Pier 4.973251
820 Total Andulay 5.042729
821 Total Antulang 5.111224
822 Total Basak 5.130003
823 Total Dauin Poblacion District 1 5.163217

References

Hill, Josh, and Clive Wilkinson. 2004. “Methods for ecological monitoring of coral reefs.” Australian Institute of Marine Science, Townsville, 117. doi:10.1017/CBO9781107415324.004.

Suzuki, R., and H. Shimodaira. 2015. “Package ‘pvclust’.” R Topics Documented, 14. http://www.sigmath.es.osaka-u.ac.jp/shimo-lab/prog/pvclust/.